A GENERALIZATION OF A TEST OF COPLANARITY 


BASED ON THE FISHER DISTRIBUTION 


BY 


JOHN HUBERT 


A THESIS 


SUBMITTED TO THE FACULTY OF GRADUATE STUDIES 
IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE


OF MASTER OF SCIENCE 


DEPARTMENT OF MATHEMATICS 
UNIVERSITY OF ALBERTA 


EDMONTON, ALBERTA 


APRIL, 1967 


UNIVERSITY OF ALBERTA 


FACULTY OF GRADUATE STUDIES 


The undersigned certify that they have read, and 
recommend to the Faculty of Graduate Studies for acceptance, 
a thesis entitled "A GENERALIZATION OF A TEST OF COPLANARITY 
BASED ON THE FISHER DISTRIBUTION", submitted by JOHN HUBERT 
in partial fulfilment of the requirements for the degree of 


Master of Science. 




ABSTRACT 


Fisher (1953) derived a distribution of directions
where the field of possible observations was the surface of the
unit sphere. The distribution of the elementary errors over this
surface had a frequency density proportional to $e^{\kappa\cos\theta}$, where
$\theta$ is the angle between an observed direction and a 'true'
direction at which $\theta = 0$ and the density is a maximum, and where $\kappa$
is a parameter. Watson (1960) derived a test of whether the
mean directions of a set of populations, each distributed according
to the Fisher distribution, all lie in the same plane normal (i.e.,
$\alpha = \pi/2$) to a prescribed direction. In the present thesis, a
generalization of this test of coplanarity to any angle $\alpha$ is
derived. A thorough account of the history, development and
application of the Fisher distribution is given. A summary of
other significance tests based on this distribution is also
provided.
CHAPTER I


INTRODUCTION 


The spherical normal distribution on the unit sphere was
derived in 1953 by R. A. Fisher [27] in connection with the direction of
remanent magnetization of specimens from the same source of rock. In
the present thesis this distribution is called the 'Fisher distribution'.
Fisher also derived most of the basic distribution theory. The
specific test of whether the mean directions of a set of populations,
distributed according to the Fisher distribution, all lie in the same
plane normal to a prescribed direction was derived in 1960 by G. S.
Watson [96]. This significance test is called the 'test of coplanarity',
and this thesis is primarily concerned with generalizing this
test and reviewing the history of the theory associated with it.


In Chapter II a comprehensive review of the distributions
associated with measurements of directions is given, and their
relationships with the Fisher distribution are demonstrated. The Gaussian
normal distribution, the von Mises distribution and the Fisher distribution
are also derived in detail, and many references to their applications are given.


A variety of significance tests of hypotheses based on the 
Fisher distribution are discussed in Chapter III. The necessary density 


functions of the required statistics for these tests are also provided. 
The Watson test of coplanarity for the sphere and its
generalization are derived in detail in Chapter IV.


A list of references completes the thesis. 
CHAPTER II


A REVIEW OF THE EARLIER WORK IN THE

STATISTICS OF DIRECTIONS


Gauss [29] and Laplace [47] considered measurements of the 
direction of celestial bodies with respect to the earth. Because of 
inescapable errors of observations, repeated measurements of the same 
direction did not coincide but rather tended to deviate about a central 
location, generally in a very small radius. To cope with this variation, 


Gauss developed what is known today as the Theory of Errors. 


The Gaussian theory assumes that the measurements may be
approximated linearly. In recent years physical quantities have appeared
which cannot be usefully approximated linearly and for which only the
orientation of the measured quantities or variables is of interest.
These variables may be regarded as vectors from an origin, ending in
points on the surface of a sphere, where, since the length of the
vector is unimportant, the radius is taken as 1. The variables of
importance are then the usual polar coordinates $\theta$ and $\phi$ for the sphere
and $\theta$ for the circle. The statistical examination of such variables
is known as the statistics of directions.


The distributions involving directions can be divided into four
basic categories: the uniform distribution, the Brownian motion
distribution, the von Mises distribution, and the Fisher distribution. This
chapter will review their development and show how these four
distributions are interrelated.


2.1 The Uniform Distribution.


The most extensively studied distribution of directions is 
undoubtedly the uniform distribution, known also as the isotropic dis- 
tribution by physicists and astronomers. The first consideration of 
this distribution was in 1734 by Daniel Bernoulli [4] who questioned 
whether the close coincidence of the orbital planes of the then known 
six planets could have arisen by chance. Since each orbit determines 
a directed line, namely the line normal to the plane and directed in 
the sense that the motion of the planet is counter-clockwise around it, 
each orbital plane corresponds to a point on the sphere in the following 
sense: since a line through a point meets the unit sphere centred there 
at exactly two points and if it is a directed line these points can be 
distinguished, then there is a one-to-one mapping of the set of directed 
axes onto the unit sphere. Hence a distribution defined on the surface 
of the sphere defines a distribution of directed lines in space and 


conversely. 


This problem was interpreted as a problem of testing the agree- 
ment with the uniform distribution on the sphere, and, in fact, Bernoulli 
applied a test of significance which rejected the uniform distribution 


on the evidence of observed orbits. 


The uniform distribution next occurs in the literature in connection
with two apparently different problems which, as will be seen later,
are very similar. In 1905, Karl Pearson [58] proposed the
following problem:

"A man starts from a point O and walks $\ell$ yards
in a straight line; he then turns through any angle
whatever, and walks another $\ell$ yards in a second
straight line. He repeats this process n times.
I require the probability that after these n
stretches he is at a distance between r and
r + dr from his starting point O."

The theory of 'random walk' (see Spitzer [80]) and the diversified
problem of 'random flights' (see Rayleigh [69]) have their origin in
this inquiry.


Essentially, Pearson was assuming a sample of n random
angles $\alpha_i$ (i = 1,2,...,n) from the uniform distribution

(2.1.1)    $$f(\alpha_i) = \frac{1}{2\pi}, \qquad 0 \le \alpha_i \le 2\pi ,$$

and his required probability is

(2.1.2)    $$dP_n(r, \ell) ,$$

where $P_n(r, \ell)$ denotes the probability that the man is at most a
distance r from the point O after n stretches of $\ell$ yards each.

If we define the probability density function $f_n(r, \ell)$ by
the relation

$$f_n(r, \ell) = \frac{d}{dr} P_n(r, \ell)$$

or equivalently

$$P_n(r, \ell) = \int_0^r f_n(x, \ell)\,dx ,$$

then

$$f_n(r, \ell)\,dr = \frac{d}{dr} P_n(r, \ell)\,dr$$

represents the probability that the distance between O and the man
is between r and r + dr.


Within a year three solutions were proposed. The first was 
by Rayleigh [68] who stated that 


"This problem ... is the same as that of the composition 


of n iso-periodic vibrations of unit amplitude and 


of phases distributed at random ... ." 


He had actually considered this problem in 1880 (see [67]). Rayleigh provided
only an asymptotic approximation to the solution for large n in two
dimensions. His solution was

(2.1.3)    $$f_\infty(r, \ell) = \frac{2r}{n\ell^2}\, e^{-r^2/(n\ell^2)} .$$

Pearson [59] acknowledged this solution to be correct only when n is
infinite (hence the subscript $\infty$), and mentioned that it gives a close
approximation for large, finite values of n over the range $0 < r < n\ell$.
(The frequency must necessarily be zero for values of r greater than
$n\ell$.)


The second solution was given by Pearson [60] himself. (His
results were published in a rather inaccessible journal, and we state
his solution in the form given in Horner [38] and in Durand and Greenwood
[17].) Pearson's solution was

(2.1.4)    $$f_n(r, 1) = \frac{2r}{n}\, e^{-r^2/n} \sum_{j=0}^{\infty} c_j\, L_j\!\left(\frac{r^2}{n}\right) ,$$

where the $L_j(r^2/n)$ are Laguerre polynomials (as defined in Watson [93])
and the $c_j$ are as follows:

$$c_0 = 1 , \qquad c_1 = 0 , \qquad 2!\,c_2 = -\frac{1}{2n} , \qquad 3!\,c_3 = -\frac{2}{3n^2} ,$$

$$4!\,c_4 = \frac{6n - 11}{8n^3} , \qquad 5!\,c_5 = \frac{50n - 57}{15n^4} , \qquad 6!\,c_6 = -\frac{1892 - 2125n + 270n^2}{144n^5} .$$

(2.1.4) is an infinite series from which the probability density function
$f_n(r, 1)$ could be calculated. The accuracy obtained by taking a finite
number of terms of this series representation has been
analyzed extensively by Lord [54], Greenwood and Durand [31], Durand and
Greenwood [17], Esseen [22], and Horner [38].


The third solution was given concisely by Kluyver [45] who
stated:

"I find that the general solution of this problem
depends upon the theory of Bessel's function ...".

Essentially, Kluyver showed that

(2.1.5)    $$P_n(r, \ell) = r \int_0^{\infty} [J_0(\ell x)]^n\, J_1(rx)\,dx ,$$

where $J_0$ and $J_1$ denote the Bessel functions as defined in Watson [93].
Since

$$\frac{d}{dr}\,[\,r J_1(rx)\,] = rx\,J_0(rx) ,$$

then

$$f_n(r, \ell) = r \int_0^{\infty} [J_0(\ell x)]^n\, x\,J_0(rx)\,dx .$$

Kluyver also showed that

$$P_n(1, 1) = \frac{1}{n+1} .$$
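Kluyver's value $P_n(1, 1) = 1/(n+1)$ and Rayleigh's large-n form (2.1.3) can both be checked by direct simulation of Pearson's walk. The following is only a minimal sketch: the function names, the number of trials and the seeds are illustrative choices and not part of the original papers, and the closed form `rayleigh_cdf` is obtained here by integrating (2.1.3) from 0 to r.

```python
import math
import random


def resultant_length(n, step, rng):
    """Distance from O after n steps of length `step` with uniform random headings."""
    x = y = 0.0
    for _ in range(n):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        x += step * math.cos(angle)
        y += step * math.sin(angle)
    return math.hypot(x, y)


def estimate_cdf(n, r, step=1.0, trials=50_000, seed=0):
    """Monte Carlo estimate of P_n(r, step), the probability of ending within r of O."""
    rng = random.Random(seed)
    hits = sum(resultant_length(n, step, rng) <= r for _ in range(trials))
    return hits / trials


def rayleigh_cdf(n, r, step=1.0):
    """Integral of Rayleigh's density (2.1.3) from 0 to r: 1 - exp(-r^2 / (n step^2))."""
    return 1.0 - math.exp(-r * r / (n * step * step))
```

For n = 2, 3, 5 the call `estimate_cdf(n, 1.0)` should fall close to 1/3, 1/4 and 1/6 respectively, while for moderately large n (say n = 50) `estimate_cdf(n, r)` and `rayleigh_cdf(n, r)` should agree to within the simulation error.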


This integral solution (2.1.5) for the two-dimensional case
was subsequently extended to three dimensions in 1919 by Rayleigh [69].
He also derived the result

(2.1.6)    $$P_n(r, \ell) = \frac{2}{\pi} \int_0^{\infty} \frac{\sin rx - rx\cos rx}{x} \left(\frac{\sin \ell x}{\ell x}\right)^{n} dx .$$

Moreover, in a subsequent paper, Rayleigh [70] derived asymptotic
expansions for the linear and circular cases of this problem.


These solutions derived by Rayleigh in 1880 and 1919 were 
generated by investigations of problems in the theory of sound. With 
the advancement of telecommunication theory in the 1940's further 
applications of Rayleigh's solution appeared. In particular, Slack [77] 
applied Rayleigh's solution to derive some results for problems involved 
in the theory of multi-channel transmission for telephone systems.
Horner [38] applied Rayleigh's solution to theoretical investigations 
for site error problems in direction finding. Chandrasekhar [10] gave 
a thorough survey of how Rayleigh's solution can be applied to many 


complicated problems in physics and astronomy. 


The three basic solutions given by Rayleigh, Pearson and
Kluyver all deal with the distribution of the length r, where
$0 < r < \infty$, of the resultant of unit vectors whose directions are
uniformly distributed either in one or two dimensions and all are
approximate in form. The first to derive an exact form for the
distribution of r in three dimensions was Fisher [27] in 1953. This
form of the distribution will be discussed separately in Section 2.4
and is called the Fisher distribution.
In two dimensions Lévy [51] showed that the addition of random
variables on the circle can lead to the uniform distribution. He also
stated that this holds for three dimensions but only for the special
case when the variables are distributed with axial symmetry. Perrin [61]
showed that for this special case this result is a direct consequence
of the convolution property of the Legendre polynomial series (see
Breitenberger [6] and Roberts and Ursell [71]). Breitenberger [7]
states that at present no analytical investigations have been made for
this case.


In recent years there have appeared many papers on the uniform 
distribution on the circle and on the sphere. Percentage points, tables 
and their application to statistics connected with the uniform distribu- 
tion have already been published (see Stephens [89]). A thorough review 
of this distribution can be found in a series of papers by Durand and 


Greenwood [16], [17], [18], [30], [31]. 


2.2 The Brownian Motion Distribution 


Another type of distribution of directions defined on the
circle or the sphere which has received extensive study throughout the
years is the 'heat flow distribution', or 'diffusion with a point source',
or the 'Brownian motion distribution'. It concerns the addition of
infinitely many variables with infinitesimal range. The straight-line
case was first encountered in 1900 by Bachelier [2]. The Brownian
motion distribution on the circle, first derived by de Haas-Lorentz [34]
in 1913, leads to the 'wrapped-up normal distribution'. Basically it
is obtained by identifying points on the line which are equivalent
mod $2\pi$ and summing the normal density over these points. The density
function can be expressed as

(2.2.1)    $$f(\theta) = \frac{1}{(2\pi)^{1/2}\,\sigma} \sum_{j=-\infty}^{+\infty} \exp\left[-\frac{(\theta - \theta_0 - 2j\pi)^2}{2\sigma^2}\right] , \qquad 0 \le \theta < 2\pi$$

(Gumbel, Greenwood and Durand [33]). Bingham [5] stated that (2.2.1)
could be written in the form

$$f(\theta) = \frac{1}{2\pi}\,\vartheta_3\!\left(\tfrac{1}{2}(\theta - \theta_0),\; e^{-\sigma^2/2}\right) ,$$

where $\vartheta_3$ is a Jacobi theta function (see [21]). The density
(2.2.1) was later studied by many authors: Polya [64]; Perrin [61],
[62]; Wintner [104], [106]; Lévy [49], [50], [51]; Marcinkiewicz [55];
Zernike [107]; Sommerfeld [78]. It has been shown by Stephens [84]
that (2.2.1) can be closely approximated by the von Mises distribution
which will be discussed later. Hence it can be expected that the theory
and methods for the von Mises distribution will be approximately correct
for the Brownian motion distribution on the circle.
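The wrapped-up normal density (2.2.1) is straightforward to evaluate numerically, because the terms of the sum decay very quickly in j. The sketch below is an illustration only: the truncation limit `terms` and the helper `integrate` are assumptions introduced here, not part of the cited work. It sums the normal density over the points $\theta + 2j\pi$ and provides a simple quadrature for checking that the result integrates to one over the circle.

```python
import math


def wrapped_normal_pdf(theta, theta0, sigma, terms=50):
    """Density (2.2.1), with the infinite sum truncated to |j| <= terms."""
    total = 0.0
    for j in range(-terms, terms + 1):
        z = theta - theta0 - 2.0 * math.pi * j
        total += math.exp(-z * z / (2.0 * sigma * sigma))
    return total / (sigma * math.sqrt(2.0 * math.pi))


def integrate(f, a, b, steps=2000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))
```

As $\sigma$ grows the density flattens towards the uniform value $1/(2\pi)$, which ties this distribution back to Section 2.1.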


On the sphere, the Brownian motion distribution was first
studied by Einstein [19] in 1905. It is the distribution, after a
fixed length of time, of the position of a particle starting from a
fixed point and moving with infinitesimal randomly oriented steps. The
density function can be expressed as

(2.2.2)    $$f(\theta) = \frac{1}{4\pi} \sum_{n=0}^{\infty} (4n+1)\,P_{2n}(\cos\theta)\,e^{-2n(2n+1)\sigma^2} , \qquad \sigma > 0 ,$$

where $\theta$ is the spherical distance from a fixed direction (the starting
point of the moving particle), and $P_j$ (j = 0,1,...) are the Legendre
[48] polynomials (see Hobson [37]). In 1960 Roberts and Ursell [71]
showed that (2.2.2) could be approximated by the Fisher distribution for
a suitable choice of parameters.


In 1921 Wiener [103] used a different method, based on functional
analysis, to derive this distribution
and to prove that it was approximately normal. The proof is based on the
theory of convolutions and the geometrical argument that the
equidistribution on the surface of an n-dimensional sphere can be projected
orthogonally on a diameter of this sphere. This approach was used by
Perrin [61] in 1928 for the case of the circle and was followed in a
general and precise manner for the sphere by Lévy [50] in 1938. Many
followers extended Lévy's basic work. Some of these are: Schoenberg
[75]; Kac [42]; Hartman and Wintner [36]; Wintner [106]; Furry [28];
Favro [24]. The latter two stressed mostly the formal aspects at the
expense of further distribution theory. (For further comments, see
Chandrasekhar [10], Coulson [11], Quenouille [66], Breitenberger [7],
Stephens [81], and Bingham [5]. In Chapter 2 of [10] there is a
thorough review of the development of the Brownian motion theory from
1905 to 1943, and his bibliography should be referred to for further
references.)
Today the Theory of Brownian Motion is an extensive field in
statistical mechanics and the number of books and articles on it is immense.


2.3 The von Mises Distribution.


As remarked earlier, it was in 1809 that Gauss developed the
theory of errors. His development was in relation to the infinite linear
continuum; the actual topological framework (for example, the surface
of the sphere) of such measurements is ignored, with a certain gain in
simplicity. In 1918 von Mises [56] applied the method of Gauss to a
circular variate and derived the distribution known as the 'circular
normal distribution on the circle' or 'the von Mises distribution'.

In 1953, R. A. Fisher [27] considered how the theory of errors would 
have had to be developed if the observations had in fact involved errors 


so large that the actual topology had to be taken into account. In fact 


he stated: 


"If astronomy had involved measurements of direction 
of such poor accuracy that they were scattered over 
a large part of a celestial sphere, the distribution 
theory of directions would have followed different 


lines." 


We shall discuss the Fisher distribution in the next section where it 
will be evident that the form of the von Mises distribution suggested 


to Fisher his original density for the sphere. 


Before discussing the von Mises distribution a discussion of 


the Gaussian normal law of errors and the circular variate will be given. 
The former is reviewed here because the von Mises derivation follows
the same methods of approach as Gauss' derivation. The latter is 
discussed because it is to the circular variate, not the linear variate, 


that von Mises applied Gauss' methods. 


2.3.1 The Gaussian Normal Law of Errors.


The normal law of errors was developed in 1809 by Gauss [29]. 


It states that:

"... the distribution of the measures about the true
value is a [linear] normal frequency distribution."

(See Whittaker and Robinson [102], page 220.) Gauss used the well-known
maximum likelihood principle of Legendre [48] and the single assumption
called the 'postulate of the arithmetic mean':

"... when any number of equally likely good direct
observations of an unknown magnitude are given, the
most probable value is their arithmetic mean."


(See Whittaker and Robinson [102], page 215.) The derivation is essentially
as follows: Suppose that for the measure of an observed quantity the
probability of an error between $\Delta$ and $\Delta + d\Delta$ is $f(\Delta)\,d\Delta$, so that
$f(\Delta)$ is the relative frequency of error. If $\varepsilon$ denotes the least
quantity to which the measuring instrument is sensitive, we can suppose
that the possible values of any measure proceed by steps of amount $\varepsilon$
and the probability of an error $\Delta$ may be taken as $f(\Delta)\varepsilon$.

If $x_1, x_2, \ldots, x_n$ denote n equally likely observations
(measurements) of a quantity x whose true value is $x_0$, then the errors
of the observations are $\Delta_i = x_i - x_0$ (i = 1,2,...,n). Since the
probability of the error in the ith measurement is $f(x_i - x_0)\varepsilon$,
the probability that a set of measures $(x_1, x_2, \ldots, x_n)$ will occur is

$$\varepsilon^n \prod_{i=1}^{n} f(x_i - x_0) .$$

If we assume that all values of x are equally likely to be
the true value before the observations are made, then when the observations
have been made the probability that the true value of x lies between
$x_0$ and $x_0 + dx_0$ is

$$\frac{\prod_{i=1}^{n} f(x_i - x_0)\,dx_0}{\int_{-\infty}^{+\infty} \prod_{i=1}^{n} f(x_i - x_0)\,dx_0} .$$


Therefore the 'most probable' value of the true value of x is that
$x_0$ which makes

$$\prod_{i=1}^{n} f(x_i - x_0)$$

a maximum, or equivalently that $x_0$ for which

(2.3.1.1)    $$\sum_{i=1}^{n} \frac{d}{dx_0} \ln f(x_i - x_0) = 0 .$$

But by the assumption that the mean is the most probable value of x,
(2.3.1.1) is equivalent to

(2.3.1.2)    $$\sum_{i=1}^{n} (x_i - x_0) = 0 .$$

Since (2.3.1.1) and (2.3.1.2) are equivalent, then for some constant
c, we have

$$\frac{d}{dx_0} \ln f(x_i - x_0) = c\,(x_i - x_0) ,$$

i.e.

$$f(x_i - x_0) = A\,e^{-\frac{c}{2}(x_i - x_0)^2} ,$$

where A is a constant. Thus since $\Delta = x_i - x_0$,

$$f(\Delta) = A\,e^{-\frac{c}{2}\Delta^2} .$$

Since the sum of the probabilities of all possible errors is unity,
the integral of $f(\Delta)$ over all $\Delta$ must equal one, which gives

(2.3.1.3)    $$f(\Delta) = \frac{h}{\sqrt{\pi}}\,e^{-h^2\Delta^2} ,$$

where h is a constant with $h^2 = c/2$. Equation (2.3.1.3) is a normal density
function so that the distribution of measures about the true value is
a normal frequency distribution.


2.3.2 The Circular Variate.


As we have remarked before, von Mises [56] applied the above
approach of Gauss to a circular variate. The nature of a circular variate
can be better understood by the following considerations:

If n events occur during a span of time (e.g. a year), if
each event occurs at a certain time, and if each date is considered as
a chance variate $\alpha$, then the distribution of events over the time period
may be considered a circular distribution of an angle by situating the
n events $\alpha_1, \ldots, \alpha_n$ on the circumference of a unit circle so that
the $\alpha_i$ are the angles of the radii drawn from the centre. To
characterize such circular observations $\alpha_1, \alpha_2, \ldots, \alpha_n$ introduce
the rectangular coordinates $x_i = \cos\alpha_i$, $y_i = \sin\alpha_i$ (i = 1,2,...,n),
whose averages

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} \cos\alpha_i , \qquad \bar{y} = \frac{1}{n}\sum_{i=1}^{n} \sin\alpha_i$$

define the means. In polar coordinates, let $\bar{a}$ and $\alpha_0$ be the
solutions of the equations:

$$\bar{x} = \bar{a}\cos\alpha_0 , \qquad \bar{y} = \bar{a}\sin\alpha_0 .$$

This solution is unique unless $\bar{x} = \bar{y} = 0$, and is given by

(2.3.2.1)    $$\bar{a} = \sqrt{\bar{x}^2 + \bar{y}^2} , \qquad \alpha_0 = \arctan\frac{\bar{y}}{\bar{x}} ,$$

where the quadrant in which $\alpha_0$ lies must be determined by inspection
of the signs of $\bar{x}$ and $\bar{y}$. Equation (2.3.2.1) defines the vector
length $\bar{a}$ at the mean direction $\alpha_0$. Clearly

$$\bar{a}^2 = \left(\frac{1}{n}\sum_{i=1}^{n} \sin\alpha_i\right)^2 + \left(\frac{1}{n}\sum_{i=1}^{n} \cos\alpha_i\right)^2 .$$

Since $\alpha_0$ is the mean direction, then $\alpha_0$ must satisfy

(2.3.2.2)    $$\sum_{i=1}^{n} \sin(\alpha_i - \alpha_0) = 0 .$$

Then the statistic $\alpha_0$ is the analogue of the average of the linear
variate.
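The mean direction and vector length of (2.3.2.1) can be computed directly from a sample of angles. The following is a minimal sketch, with a function name chosen here for illustration; it uses the two-argument arctangent `math.atan2`, which settles the quadrant from the signs of the averaged cosine and sine automatically, in place of the inspection described above.

```python
import math


def circular_mean(angles):
    """Return (a_bar, alpha_0) for angles in radians, with alpha_0 in [0, 2*pi)."""
    n = len(angles)
    x_bar = sum(math.cos(a) for a in angles) / n
    y_bar = sum(math.sin(a) for a in angles) / n
    a_bar = math.hypot(x_bar, y_bar)
    if a_bar == 0.0:
        # The solution of (2.3.2.1) is not unique when x_bar = y_bar = 0.
        raise ValueError("mean direction undefined: x_bar = y_bar = 0")
    alpha_0 = math.atan2(y_bar, x_bar) % (2.0 * math.pi)
    return a_bar, alpha_0
```

For the two directions 80° and 100° this gives a mean direction of 90° with vector length cos 10°. For directions such as 350° and 10° the arithmetic mean of the angles (180°) points the wrong way, whereas the circular mean direction is 0° (equivalently 360°), which is why the linear average cannot be used for a circular variate.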
2.3.3 Derivation of the von Mises Distribution.


The derivation of the von Mises distribution (or the circular
normal distribution) applies the method of Gauss and the maximum
likelihood assumption to the deviations of measured atomic weights from
integral values ('Ganzzahligkeit'). The derivation is essentially as
follows:

If $\alpha_1, \alpha_2, \ldots, \alpha_n$ denote n equally likely observations of
a circular variate $\alpha$ whose true value is $\alpha_0$, then let the 'errors
of observations' be

$$\theta_i = \alpha_i - \alpha_0 \qquad (i = 1, 2, \ldots, n) .$$

von Mises asked for a distribution $f(\theta_i)$ such that the likelihood
function is a maximum for that $\alpha_0$ given by

(2.3.3.1)    $$\sum_{i=1}^{n} \sin(\alpha_i - \alpha_0) = 0$$

((2.3.3.1) is the condition based on the postulate of the arithmetic
mean for circular variables). The maximum of the likelihood function

$$\prod_{i=1}^{n} f(\alpha_i - \alpha_0)$$

occurs when

(2.3.3.2)    $$\sum_{i=1}^{n} \frac{d}{d\alpha_0} \ln f(\alpha_i - \alpha_0) = 0 .$$

Since the two sums (2.3.3.1) and (2.3.3.2) are equal for arbitrary
values of the $\alpha_i$, then for some constant k

(2.3.3.3)    $$\frac{d}{d\alpha_0} \ln f(\alpha_i - \alpha_0) = k \sin(\alpha_i - \alpha_0) .$$

This differential equation has the solution

$$f(\alpha_i - \alpha_0) = c\,e^{k\cos(\alpha_i - \alpha_0)} ,$$

where the two positive parameters c and k satisfy the condition
that the integral of the density over the circumference is one, i.e.,

$$\int_0^{2\pi} f(\theta)\,d\theta = 1 ,$$

so that

$$\frac{1}{c} = \int_0^{2\pi} e^{k\cos\theta}\,d\theta = 2\pi J_0(ik) = 2\pi I_0(k) ,$$

where $J_0(ik)$ is the Bessel function of the first kind of order zero
with imaginary argument ik and $I_0(k)$ is the corresponding modified
Bessel function. Thus

$$f(\alpha - \alpha_0) = \frac{1}{2\pi I_0(k)}\,e^{k\cos(\alpha - \alpha_0)} ,$$

where if we let $\alpha_0 = 0$ we have

(2.3.3.4)    $$f(\alpha) = \frac{1}{2\pi I_0(k)}\,e^{k\cos\alpha} .$$

The polar vector of this distribution is usually taken as the ($\alpha = 0$)
line of polar coordinates. The probability can be written in the form

$$P[\alpha < \theta < \alpha + d\alpha] = \frac{e^{k\cos\alpha}}{2\pi I_0(k)}\,d\alpha ,$$

so that this distribution can be considered as the error distribution
for the circumference of a circle.
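The density (2.3.3.4) needs only the modified Bessel function $I_0(k)$, which has the standard power series $I_0(k) = \sum_{j \ge 0} (k/2)^{2j} / (j!)^2$. The sketch below is illustrative: the function names and the truncation at 60 terms are choices made here, adequate for moderate values of k, and not part of von Mises' derivation.

```python
import math


def bessel_i0(k, terms=60):
    """Modified Bessel function I_0(k) from its power series, truncated to `terms` terms."""
    total, term = 0.0, 1.0
    for j in range(terms):
        total += term
        # Ratio of consecutive series terms: (k/2)^2 / (j+1)^2.
        term *= (k / 2.0) ** 2 / ((j + 1) ** 2)
    return total


def von_mises_pdf(alpha, k, alpha_0=0.0):
    """Circular normal density exp(k cos(alpha - alpha_0)) / (2 pi I_0(k))."""
    return math.exp(k * math.cos(alpha - alpha_0)) / (2.0 * math.pi * bessel_i0(k))
```

With k = 0 the function returns the constant $1/(2\pi)$, and for any k the density integrates to one over an interval of length $2\pi$.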


Remarks. 


In 1953 Gumbel, Greenwood and Durand [33] christened (2.3.3.4)
the density function of the 'circular normal distribution' because it
was derived in a way which is strictly analogous to the Gaussian
derivation.

The density (2.3.3.4) is the case n = 2, whereas the Fisher
distribution (see Section 2.4) is the case n = 3. (Recently
Bingham [5] derived the analogue of (2.3.3.4) by means of projections
for the case of n > 2.)

We note here that Polya [65] has proved that this distribution
shares some properties analogous to those of the linear normal distribution.
However, Kac and Kampen [43] as well as Breitenberger [7] have shown that
some properties of the normal law cannot be obtained by any but trivial
correspondences on the circle.
Some properties of the von Mises distribution are:

i) If k = 0, $f(\alpha)$ degenerates to the uniform distribution
$f(\alpha) = \frac{1}{2\pi}$.

ii) Since, as k becomes larger, the larger part of the
distribution is situated in the neighbourhood of the mode
(most probable value) $\alpha_0$, the parameter k is a
measure of concentration.

iii) If k is large the distribution converges to the linear normal
distribution (2.3.1.3). (See Gumbel, Greenwood and Durand
[33].)

iv) 1/k is analogous to the variance of the linear normal
distribution (see Gumbel, Greenwood and Durand [33]).

v) The maximum likelihood estimate of k is given by the solution
of
$$\frac{I_0'(k)}{I_0(k)} = \frac{I_1(k)}{I_0(k)} = \bar{a} ,$$
where $\bar{a}$ is defined by (2.3.2.1). (See Arnold [1].)

vi) The maximum likelihood estimate of $\alpha_0$ is given by the
solution of
$$\sum_{i=1}^{n} \sin(\alpha_i - \alpha_0) = 0 .$$
(See Greenwood and Durand [31].)
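The estimate of k in property (v) has no closed form, but because the ratio $I_1(k)/I_0(k)$ increases steadily from 0 towards 1 it can be found by bisection. The sketch below is one possible numerical approach and not the tabular method of the cited papers; the function names, the series truncation and the upper limit `k_max` are assumptions made for this illustration, and the power series used are the standard ones for $I_0$ and $I_1$.

```python
import math


def bessel_ratio(k, terms=200):
    """A(k) = I_1(k) / I_0(k), with both functions summed from their power series."""
    half = k / 2.0
    i0 = i1 = 0.0
    t0, t1 = 1.0, half  # j = 0 terms of the I_0 and I_1 series
    for j in range(terms):
        i0 += t0
        i1 += t1
        t0 *= half * half / ((j + 1) ** 2)
        t1 *= half * half / ((j + 1) * (j + 2))
    return i1 / i0


def estimate_k(a_bar, k_max=100.0, tol=1e-10):
    """Solve I_1(k) / I_0(k) = a_bar for k by bisection on [0, k_max]."""
    if not 0.0 <= a_bar < bessel_ratio(k_max):
        raise ValueError("a_bar outside the range covered by k_max")
    lo, hi = 0.0, k_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bessel_ratio(mid) < a_bar:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

A sample with vector length $\bar{a}$ close to 0 yields an estimate of k close to 0 (the uniform case), while $\bar{a}$ close to 1 yields a large k, in line with the interpretation of k as a measure of concentration.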


Arnold [1] generalized the circular normal distribution to the 
bivariate case by considering the centre of gravity of points on a sphere, 
and also investigated the 'wrapped-up normal' on the surface of a sphere. 
Breitenberger [7] has also derived the bivariate analogue of the von Mises 
distribution. Brooks and Carruthers [8] have translated a bivariate 
normal distribution into polar coordinates with the origin removed from 
the centre of the distribution and have integrated out with respect to 


the radius. 


Tables, tests and estimation procedures are available in [17],
[30], [31], [32], [33], [101]. Recently, Stephens [81], [82], [87], [88]
has published a series of papers with accurate tables for the calculation
of the maximum likelihood estimate of k and a significance table for
the null hypothesis k = 0. Much of the basic sampling distribution,
however, involves integrals of Bessel functions which have not been
tabulated except for the null case k = 0. More recently, Downs [14],
[15] has given a very exhaustive study of the von Mises distribution.


The problem of generating other circular distributions around
the circumference of a circle (e.g., the 'wrapped-up normal' and the
Cauchy distribution) has been studied by Perrin [61], Lévy [50],
Marcinkiewicz [55], Wintner [105], [106], and Hartman and Wintner [36].
Stephens [84], [85] has shown that on the circle the 'wrapped-up normal'
can be approximated by the von Mises distribution. The 'angular cardioid'
distribution is discussed in [33], and the 'covering circle' of a sample
from a circular normal distribution appears in Daniels [13].
Goodness-of-fit tests for the circle have been studied thoroughly by Watson [97],
[98].


The von Mises distribution has practical applications to
geophysical, vital and economic statistics, as shown by Gumbel [32].
Epstein and Sobel [20] have used a variation of this distribution in
'life testing'. Other applications can be found in [15], [30], [31],
[78], [101].


2.4 The Fisher Distribution. 


In 1953 Fisher [27] derived a distribution of directions
known today as the 'spherical normal distribution', the 'von Mises
distribution on the sphere' or simply the 'Fisher distribution'.
Before Fisher's derivation, as pointed out in Section 2.1, most
measurements in astronomy had involved directions which could be
approximated linearly. However, in other sciences such as geology,
biology and meteorology there occur measurements of direction which
cannot be usefully linearly approximated, as demonstrated by Fairbairn
[23], Pincus [63], Runcorn [74], Tucker [91] and Waterman [92]; thus
the need for Fisher's distribution.


2.4.1 Derivation.


In his paper, Fisher [27] considered the field of possible
observations to be the surface of a unit sphere and let the distribution of
elementary errors over this surface have a frequency density proportional to

$$e^{\kappa\cos\theta} ,$$

where $\theta$ is the angular displacement between an observed vector and a
fixed polar vector, at which $\theta = 0$ and the density is a maximum provided
$\kappa$ is a positive constant. If $\theta$ and $\phi$ are the usual spherical polar
coordinates (so that $\theta = 0$ is the 'North pole' axis), then

$$P[\theta < \theta' < \theta + d\theta,\; \phi < \phi' < \phi + d\phi] = f(\theta, \phi)\,d\theta\,d\phi ,$$

where $0 \le \theta \le \pi$ and $0 \le \phi \le 2\pi$. Since the surface area within limits
$d\theta$ and $d\phi$ is proportional to $\sin\theta\,d\theta\,d\phi$, then

$$f(\theta, \phi)\,d\theta\,d\phi = C\,e^{\kappa\cos\theta}\sin\theta\,d\theta\,d\phi ,$$

where C is the constant of proportionality. Since we require

$$\int_0^{2\pi}\!\int_0^{\pi} f(\theta, \phi)\,d\theta\,d\phi = 1 ,$$

we have

$$C = \frac{\kappa}{2\pi\,(e^{\kappa} - e^{-\kappa})} = \frac{\kappa}{4\pi\sinh\kappa} ,$$

so that

(2.4.1)    $$f(\theta, \phi) = \frac{\kappa}{4\pi\sinh\kappa}\,e^{\kappa\cos\theta}\sin\theta ,$$

where $\kappa$ is taken positive so that the distribution is located at the
(north) pole (if $\kappa < 0$, this gives the same distribution with polar
direction reversed and hence need not be considered).
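The normalizing constant $\kappa / (4\pi\sinh\kappa)$ can be checked numerically, and directions can be drawn from the distribution by inverting the distribution function of $w = \cos\theta$, whose density on [-1, 1] is proportional to $e^{\kappa w}$. The sketch below is illustrative only: the function names are chosen here, and the inversion formula $w = 1 + \ln\!\big(u + (1-u)e^{-2\kappa}\big)/\kappa$ for a uniform variate u is a standard device and is not taken from Fisher's paper.

```python
import math
import random


def fisher_pdf(theta, kappa):
    """Fisher density per unit d(theta) d(phi): kappa/(4 pi sinh kappa) e^{kappa cos theta} sin theta."""
    return (kappa / (4.0 * math.pi * math.sinh(kappa))
            * math.exp(kappa * math.cos(theta)) * math.sin(theta))


def sample_fisher(kappa, rng):
    """Draw (theta, phi) about the pole theta = 0 by inverting the CDF of cos(theta)."""
    u = rng.random()
    w = 1.0 + math.log(u + (1.0 - u) * math.exp(-2.0 * kappa)) / kappa
    w = max(-1.0, min(1.0, w))  # guard against rounding just outside [-1, 1]
    theta = math.acos(w)
    phi = rng.uniform(0.0, 2.0 * math.pi)  # the density does not depend on phi
    return theta, phi
```

A call such as `sample_fisher(3.0, random.Random(0))` returns one direction; over many draws the average of $\cos\theta$ should approach $\coth\kappa - 1/\kappa$, the mean of $\cos\theta$ under this density.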


2.4.2 Properties.


If $\kappa = 0$ the distribution is uniform all over the spherical
surface. When $\kappa$ is large the distribution is confined to a small
portion of the sphere in the neighbourhood of the maximum (the pole)
and the distribution is approximately a linear normal distribution; for,
by letting

$$\cos\theta = 1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \cdots ,$$

then for large $\kappa$ and small $\theta$, the approximation becomes

$$e^{\kappa\cos\theta} \approx A\,e^{-\kappa\theta^2/2} ,$$

where A is some constant.

Since large values of $\kappa$ lead to small dispersions, $\kappa$
is an accuracy parameter. (For an explanation of the significance of
$\kappa$ in numerical terms see Watson and Irving [100], page 290.)


If we let $(\ell_i, m_i, n_i)$ be the direction cosines of an
observed direction and if $(\lambda, \mu, \nu)$ are the direction cosines of the
mean direction from the centre of the sphere then

$$\cos\theta_i = \lambda\ell_i + \mu m_i + \nu n_i .$$
Fisher [27] has shown that for a sample of size N the maximum
likelihood estimator of $(\lambda, \mu, \nu)$ is the set of direction cosines $(\ell, m, n)$,
say, of the vector resultant of the sample unit vectors, i.e.,

$$\ell = \sum_{i=1}^{N} \ell_i \,/\, R , \qquad m = \sum_{i=1}^{N} m_i \,/\, R , \qquad n = \sum_{i=1}^{N} n_i \,/\, R ,$$

where R denotes the length of the vector resultant; i.e.,

$$R^2 = \Big(\sum_{i=1}^{N} \ell_i\Big)^2 + \Big(\sum_{i=1}^{N} m_i\Big)^2 + \Big(\sum_{i=1}^{N} n_i\Big)^2 .$$


If $(\lambda, \mu, \nu)$ is estimated, then the maximum likelihood
estimator of $\kappa$, $\hat{\kappa}$ say, is the solution of

$$\coth\hat{\kappa} - \frac{1}{\hat{\kappa}} = \frac{R}{N} .$$
When N is large and R is near N, this has the approximate
solution

If the pole $(\lambda, \mu, \nu)$ is known, and x denotes the

Other properties of the Fisher distribution will be given in
Section 3.1 where these properties will be used to describe various tests
of significance.
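The estimation steps of this section can be sketched numerically. The code below is a hedged illustration rather than Fisher's own procedure: the function names are chosen here, and the equation solved for the concentration estimate, $\coth\kappa - 1/\kappa = R/N$, is assumed to be the standard maximum likelihood equation for the case in which the mean direction is also estimated. Bisection is used because the left-hand side increases steadily from 0 towards 1.

```python
import math


def resultant(vectors):
    """Return ((l, m, n), R): direction cosines and length of the vector resultant."""
    vectors = list(vectors)
    sums = [sum(v[axis] for v in vectors) for axis in range(3)]
    r = math.sqrt(sum(s * s for s in sums))
    if r == 0.0:
        raise ValueError("resultant has zero length; direction undefined")
    return tuple(s / r for s in sums), r


def langevin(kappa):
    """coth(kappa) - 1/kappa, using a short series near zero to avoid cancellation."""
    if kappa < 1e-3:
        return kappa / 3.0 - kappa ** 3 / 45.0
    return 1.0 / math.tanh(kappa) - 1.0 / kappa


def solve_kappa(r_over_n, kappa_max=1000.0, tol=1e-10):
    """Solve coth(kappa) - 1/kappa = R/N for kappa by bisection on [0, kappa_max]."""
    if not 0.0 <= r_over_n < langevin(kappa_max):
        raise ValueError("R/N outside the range covered by kappa_max")
    lo, hi = 0.0, kappa_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if langevin(mid) < r_over_n:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Given a list of N unit vectors, `direction, R = resultant(vectors)` followed by `solve_kappa(R / N)` yields the estimated mean direction and concentration. Tightly clustered samples have R close to N and therefore a large estimated $\kappa$.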


2.4.4 Remarks. 


The probability density (2.4.1) on the sphere was first called 
the density function characterizing the Fisher distribution by Watson 
[94] in 1956. Although Arnold [1] in 1941 described this three dimensional 
case before Fisher did in 1953, Arnold's description is from the view 
of estimation while Fisher's work is a more thorough approach. Arnold's 
approach was to extend the Pearson [58] 'random walk' problem to the circle 
and then to the sphere. The problem can now be stated as follows:

"A particle, initially at a given point on a circle
or sphere, moves in an arbitrary direction along the
circle or along an arbitrary great circle through
the initial point on the sphere at a fixed speed
for a fixed time. It again chooses an arbitrary
direction or arbitrary great circle through the
new position along which it moves at the same speed
for the same amount of time. This process is repeated
many times. We let the time occupied by one unit of
this motion approach zero, keeping the total elapsed
time for the whole process of the same order of
magnitude. We ask for the probability that the
particle will be in a given section of the circle
or spherical surface after a given total elapsed
time for the whole process."


Recently, Stephens [85] studied this random walk on the circumference 
of a circle, and Roberts and Ursell [71] have given a more general 


investigation of the random walk on the surface of the sphere. 


Breitenberger [7] has demonstrated why the Fisher distribution 
may be justifiably called the 'spherical normal distribution' by 
investigating the properties that are preserved under spherical trans- 
formations. Also he has extended (2.4.1) to the bivariate analogue. 

The Fisher distribution has also been derived in several physical 
theories, see, for example, Jeffreys [40] or Joos [41] who show that 
the Fisher distribution is a generalization of a two-dimensional form 
used in mechanics. Lord [52], [53], [54] has also investigated the
spherical normal distribution.


In Section 2.2, we have pointed out how the Brownian motion 
distribution in three dimensions is related to the Fisher distribution. 


In Section 2.4.2 the relation to the uniform distribution is given.
And, as we have seen in Section 2.3, the von Mises distribution is the
two dimensional case of the Fisher distribution. The n-dimensional
case has been derived by Stephens [81]. These higher dimensional
distributions are shown to be derivable from lower dimensional ones
by rotations (see Bingham [5], Downs [14] and Greenwood [30]). Downs
[14] and Bingham [5] have demonstrated still more relationships between
the von Mises and the Fisher distributions.


Following Fisher's paper there appeared a series of papers
by Watson [94], [95], [96], [99], Watson and Irving [100], and Watson
and Williams [101], in which the theory of this distribution has been
further developed with confidence techniques being applied to inference
problems. (Greenwood [30] gives an excellent review of the distribution
theory as well as some other distributions involving angular variables.)
In 1960 Roberts and Ursell [71] demonstrated that Perrin's [61] work of
1928 and Fisher's work hardly differ numerically. Recently, in a series
of papers by Stephens [82], [83], [86], [87], [88], tables, graphs and
nomograms were constructed for a variety of significance tests (see
Chapter III).


As stated earlier, certain measurements that cannot be 
usefully approximated linearly in some sciences generated this 
theory. In fact, Fisher's original paper was concerned with the 
analysis of measurements of directions of remanent magnetism in 
lava rocks. Although many applications in other sciences have been 
found, it is primarily in geology that this theory has been most 


useful. (See, for example, Fisher [27], Runcorn [74], Pincus [63],
Watson [94], [96], Watson and Irving [100] and Watson and Williams [101].)
Some other papers which also use actual geological data to illustrate 

the use of the Fisher distribution are Krumbein [46], Arnold [1], and 
Selby [76] in which the problem of determining the preferred orientation 
(direction) of optical axes of specimens of crystals, rocks and pebbles 

is analyzed. The folding of a layer of rock has been analyzed by Watson 


[99] and Bingham [4]. 


Krumbein [46] has mentioned that these orientation problems 
are similar to the problem of determining the direction of neutron charges. 
Breitenberger [7] suggests problems of paramagnetism and of orientational 


polarizability in mechanics are also related. 


Fisher [27] and Krumbein [46] also mention that in astronomy 


the problem of determining the position of stars is a related application. 


In zoology, Watson [98] (as well as Pearson [60]) uses data 


from the migration of displaced birds such as pigeons as an illustration. 


We note here that although the Fisher distribution is easy to
work with it has one limitation: it is inapplicable to distributions
which are elliptical about a pole. Breitenberger [7] states that this
is due to:

"... the fault that it contains only one parameter.
This axial symmetry often precludes application in
palaeomagnetic research when markedly elongated
samples occur (for striking instances, see Watson
and Irving [100])."
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Since the practical applications of Fisher's distribution 


usually involve tests of some kind or another and a generalized test 


involving the Fisher distribution will be given in Chapter IV, a chapter 


reviewing the different significance tests will be given next. 


There are other distributions involving the sphere with
densities similar to that used by Fisher. Some of the distributions
are:

i)   the 'girdle distribution' of Selby [76] which utilizes densities
     proportional to

     Stephens [88] has analyzed these still further and has published
     significance tables.

ii)  the 'equatorial distribution' on the sphere, which involves
     densities proportional to

     $$e^{\kappa\sin^2\theta} .$$

     This density was also mentioned in several connections by Arnold
     [1] and Breitenberger [7] and was studied in detail by Watson [99],
     and recently a definitive discussion of this density has been
     given by Bingham [5].

iii) the 'bipolar distribution' for which Krumbein [46] (as well as
     Arnold [1] and Breitenberger [7]) has suggested the densities
     depend on
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CHAPTER III 


Practical applications of Fisher's distribution require a series 
of significance tests entirely analogous to those in current use for the 
linear normal distribution. The tests can be classified in the following 


two categories: 
a) tests of hypotheses concerning k , 


b) tests of hypotheses concerning the polar vector. 


Before stating these tests it will be necessary to have some preliminary
distribution theory. The two types of tests are outlined in Sections 3.2
and 3.3.


3.1 Introductory Results.


To the geometry of the sphere, Fisher [27] applied the argument
of Irwin [39] and Hall [35] on the rectangular distribution and his own
geometrical argument developed earlier (Fisher [26]) to derive the
following functions:

Stephens [81] calls $P_N(x)$ and $Q_N(x)$ the 'Fisher polynomials' of degree
(N-1) and (N-2) respectively. Watson [94] introduced the notation
$\langle x \rangle$. Fisher [27] also defines the function


ok) Dy(*) = TRByT Wy CH) - 


Case 1: 1 SAMPLE, κ ≠ 0.


By means of the polynomials Fisher [27] derived the necessary
sample distributions. If N is the number of observations in a sample
from a population possessing the Fisher distribution, if R is the
length of the vector resultant of these observations, if $\Theta$ is the
angle between R and the x-axis (i.e., the angular error) and X is the
component of R on this axis, then the three statistics which are
considered are R, X and c, where $c = \cos\Theta$. The following relations
were derived by Fisher [27]:

(3.1.4)  $X = \sum_{i=1}^{N} x_i = \sum_{i=1}^{N} \cos\theta_i = R\cos\Theta = Rc$ ,

where the x-axis will be the axis $\theta = 0$ (see also Stephens [81]);


(3.1.5) £ (Re) = (sf —)" ce a(R)R , 
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the joint density function of R and c; 


K N a 
(5.126) £ (x) = are Ty-1)? Py(X) 


the density function of X = Rc;


K N 2 sinh kR 
(321.7) f(a) = Gy ees 


the density function of R;


O(R) (N-1)! 
(3.1.8) £ (R|X) = Bre aT ae 


the conditional density of R, given X. 


Case 2: 1 SAMPLE, k= 0. 


The case of $\kappa = 0$ is called the case of randomness or uniformity.
From Case 1, where $\kappa \neq 0$, the form of the following density functions will
be immediately obvious. We have
(3.1.9) £(R) = =z Ay(R) 
oe lat 0 oN-1 N e 


the density function of R when $\kappa = 0$. We note here that Rayleigh
[69] found

(3.1.10)  $f_0(R) = \frac{2R}{\pi} \int_0^{\infty} \frac{\sin^N x}{x^{N-1}} \sin Rx \; dx$ .


Stephens [81] has verified that (3.1.9) and (3.1.10) are equivalent. 
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When N is large, it follows from (3.1.9) that the asymptotic density 


function of R is

(3.1.11)  $f_0(R) = \frac{3\sqrt{6}}{\sqrt{\pi}} \, \frac{R^2}{N^{3/2}} \exp\left(-\frac{3R^2}{2N}\right)$ .


Fisher [27] has derived the following functions when k = 0 : 
1 
(3.1.12) £,(R,c) = aN Dy (R)R Me 


the joint density function of R and c 


(341.13) f(x). = > 


the density function of X = Rc 


Case 3: p SAMPLES, κ ≠ 0.


Fisher [27] has generalized the results of Case 1 to any number
of samples. Suppose we have p samples each from the same population
possessing the Fisher distribution. Let $N_i$ denote the number of
observations in the ith sample, $R_i$ denote the length of the vector resultant
of the ith sample, R denote the length of the vector sum of the
resultants of the separate samples and

$$N = \sum_{i=1}^{p} N_i .$$

Then the joint density of all the $R_i$ and R becomes
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K N 2 sinh kR ft 
(3.1.14) (5 sinh 2) R ‘pl Py (Ry) 4 


the density of R is 


K a 2 sinh KR 


(3.1.15) (sete z 2, (R) , 


and the conditional density of all the $R_i$, given R, is


Th #4, 


(BR) 


(3.1.16) 


(See also Watson and Williams [101].) 


Case 4: 1 SAMPLE, k LARGE. 


The limiting or asymptotic distributions for the case of large
$\kappa$ have been derived by Watson [94]. They are:

(3.1.17)  $2\kappa(N - X) \sim \chi^2_{2N}$
(3.1.18)  $2\kappa(N - R) \sim \chi^2_{2(N-1)}$
(3.1.19)  $2\kappa(R - X) \sim \chi^2_{2}$

where $\chi^2_{2N}$ is the chi-square distribution with 2N degrees of freedom.


If $\kappa$ is large so that R is near N, (3.1.18) follows from
(3.1.7). Indeed, under this assumption (3.1.7) reduces to

$$\frac{\kappa^{N-1}}{(N-2)!} \, e^{-\kappa(N-R)} (N-R)^{N-2} ,$$

and making a change of variable from R to $x = 2\kappa(N-R)$, we obtain

$$\frac{1}{2^{N-1}(N-2)!} \, x^{N-2} e^{-x/2} ,$$

which is the density function of a $\chi^2$ variable with 2(N-1) degrees
of freedom. Thus

$$2\kappa(N-R) \sim \chi^2_{2(N-1)} .$$

Similarly (3.1.17) can be verified.


We note here for completeness' sake that for the two-dimensional
case, i.e., for the case of the circle, the corresponding asymptotic
distributions are

$$2\kappa(N - X) \sim \chi^2_{N} ,$$
$$2\kappa(N - R) \sim \chi^2_{N-1} ,$$
$$2\kappa(R - X) \sim \chi^2_{1} ,$$

and can be found in Watson and Williams [101].


Case 5: 1 SAMPLE, LARGE N . 


We require the following approximate forms of the Fisher 


polynomials P(x) and Quy (*) defined in (3.1.1) and (3.1.2), when 
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N is large, 
2 
R 
py rnin: oto! (eens 
\ a " 
Vt VN 
2 
Q. (R) ~ n-2): oN-t aie ee 
a NERS 
Also we have 
£ P(x) = - (N-1) Q(x) 
dx N 3 N 
2 
QN-l 36 ‘ ae 
O,(R) aay: IE Re 


(See Stephens [81].) From Case 1, i.e., from (3.1.6) and (3.1.7), the 


approximation for the density function of R_ becomes 


: ep 
K N ) ON 
£ (R) ~ ( ) R sinh (KR) e : 

K sinh k Apo NO 2 
and the density function of X becomes 

Je KX - x 
£ ,(X) ak a : ¥ = ¥ 
Ax VN 
o 
nee Oe? 18 


while the asymptotic distribution for large N and $\kappa = 0$ is

(3.1.20)  $\frac{3R^2}{N} \sim \chi^2_3$ .


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This has been concisely proven in Stephens [81] and provides an approximate 
test for randomness (see Section 3.2.2). Also if $\kappa = 0$ and R is very
close to N such that N - R < 2, then (3.1.9) becomes

(3.1.21)  $f_0(R) = \frac{R \, (N-R)^{N-2}}{2^{N-1} (N-2)!}$

(see Stephens [81]).


3.2 Test of Hypotheses Concerning κ.


3.2.1 Exact Tests for κ.


Watson [94] has suggested that when the polar vector of the 
population is known, an exact significance test of k can be achieved 
by using (3.1.6) since X = Rc is a sufficient statistic of k (see 
proof in Stephens [81] page 78). The construction of confidence intervals 
for k when the polar vector is known is given in detail by Stephens [88] 


by using significance tables. 


3.2.2 Tests of the Null Hypothesis κ = 0.


Since when $\kappa = 0$ the density function of the Fisher
distribution is constant, the null hypothesis $\kappa = 0$ is the hypothesis
of randomness (or uniformity). Bruckshaw and Vincenz [9] suggested that
the value of the density of R, with $\kappa = 0$, computed for the observed
value, be compared with the modal value of the density of R (with $\kappa = 0$).
However it is Watson [94] who first provides a statistical test of
randomness. It is based on the following argument:

"Given a sample of size N, the length R will
be large if the sample shows preferred direction
and small if it does not. Assuming that there is
no preferred direction (i.e. $\kappa = 0$) a value
$R_0$, say, may be calculated which will be exceeded
by R with any stated probability."


Thus an exact test may be made of the hypothesis that a sample of N
vectors is randomly distributed by finding the probability that $R > R_0$:

$$P[R > R_0] = \int_{R_0}^{N} f_0(R) \, dR = \alpha ,$$

where, for the Fisher distribution, $\alpha$ may be explicitly calculated
by using the Fisher form of $f_0(R)$ given in (3.1.9).


In a subsequent paper Watson [95] gave this test and included
significance points of $R_0$ for various probabilities and sample sizes:
$\alpha = 5\%$ and $\alpha = 1\%$, N = 5 to N = 20. Stephens [81] recomputed
this table and extended $\alpha$ to four values: 1%, 2%, 5%, and 10% for
values of N up to N = 20. (See his Table 2.2.) Thus to carry out the test
it is merely necessary to enter Stephens' table at the row corresponding
to the number of observations in the sample in order to find the value
of $R_0$ which will be exceeded with a given probability in sampling
from a population in which $\kappa = 0$. Stephens [86] recently has extended
his table to include values of N = 4 to N = 25. He describes the
exact test for randomness as follows:

i) Find R.

ii) Find $R_0$ in the table for appropriate N and $\alpha$.

iii) If $R > R_0$, reject the hypothesis $\kappa = 0$ at the $\alpha$-significance
level.


When N is large there also exists an approximate test for
randomness. It was first given by Watson [95] by using the form

$$\frac{3R^2}{N} \sim \chi^2_3$$

(see Equation (3.1.20)). Stephens [81], [82] has established this
same result but his derivation uses the multi-dimensional form of the
Central Limit theorem (see Cramér [12]). The approximate test for
randomness for large N, given by Stephens [86], is:


i) Find R. 


ii) Find $R_0$ from the equation

$$R_0^2 = \frac{N}{3} \, \chi^2_3(\alpha) ,$$

where $\chi^2_3(\alpha)$ is the $\alpha$-significance level of the chi-square
distribution with 3 degrees of freedom, upper tail.

iii) If $R > R_0$, reject the hypothesis $\kappa = 0$ at the $\alpha$-significance
level.
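The large-N procedure above can be sketched in a few lines. This is an illustrative sketch only, not the tabulated procedure of Stephens [86]: instead of looking up $R_0$, it refers $3R^2/N$ directly to the chi-square distribution with 3 degrees of freedom, whose upper-tail probability has a closed form in terms of the complementary error function. Comparing that probability with $\alpha$ is equivalent to comparing R with $R_0$. The function names are hypothetical, and the approximation is only meaningful when N is large.

```python
import math

def chi2_3_upper_tail(x):
    """P(chi-square with 3 d.f. > x), closed form valid for 3 degrees of freedom."""
    if x <= 0.0:
        return 1.0
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))

def randomness_test(vectors, alpha=0.05):
    """Approximate large-N test of kappa = 0 based on 3 R^2 / N.

    Returns (statistic, p_value, reject), where reject is True when the
    hypothesis of randomness is rejected at level alpha.
    """
    N = len(vectors)
    sx = sum(v[0] for v in vectors)
    sy = sum(v[1] for v in vectors)
    sz = sum(v[2] for v in vectors)
    R_squared = sx * sx + sy * sy + sz * sz
    statistic = 3.0 * R_squared / N
    p_value = chi2_3_upper_tail(statistic)
    return statistic, p_value, p_value < alpha

# Two extreme illustrations: perfectly balanced directions, and identical ones.
balanced = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
aligned = [(0.0, 0.0, 1.0)] * 20
balanced_result = randomness_test(balanced)
aligned_result = randomness_test(aligned)
```

The balanced sample has R = 0 and is never rejected, while the fully aligned sample gives a statistic of 60 and is rejected at any conventional level.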


3.2.3 Tests of κ = κ_0.


Watson [94] has suggested that an exact significance test of
the null hypothesis $\kappa = \kappa_0$ may be made by using the density of R in
(3.1.7) with $\kappa = \kappa_0$ when the alternative hypothesis is $\kappa > \kappa_0$.

As an approximate test, when $\kappa$ is large, we use (3.1.17) and
(3.1.18), that is,

$$2\kappa(N - X) \sim \chi^2_{2N} ,$$
$$2\kappa(N - R) \sim \chi^2_{2(N-1)} .$$

Hence an approximate test of the hypothesis $\kappa = \kappa_0$ can be made by
referring $2\kappa_0(N-X)$ or $2\kappa_0(N-R)$ to $\chi^2$ tables with 2N and 2(N-1)
degrees of freedom respectively.
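Because both reference distributions have an even number of degrees of freedom, their tail probabilities can be computed without tables. The sketch below is an illustration under that observation, not a procedure taken from the cited papers: it uses the standard identity that the upper tail of a chi-square variable with 2m degrees of freedom is a finite Poisson sum. The function names and the example values of $\kappa_0$, N and R are hypothetical.

```python
import math

def chi2_even_upper_tail(x, dof):
    """P(chi-square with an even number `dof` of d.f. > x), via the Poisson sum
    exp(-x/2) * sum_{j=0}^{dof/2 - 1} (x/2)^j / j!."""
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("dof must be a positive even integer")
    if x <= 0.0:
        return 1.0
    half = x / 2.0
    term = math.exp(-half)        # j = 0 term
    total = term
    for j in range(1, dof // 2):
        term *= half / j          # build (x/2)^j / j! incrementally
        total += term
    return total

def kappa0_statistic_tails(kappa0, N, R):
    """Refer 2*kappa0*(N - R) to chi-square with 2(N-1) d.f.

    Returns (statistic, upper_tail, lower_tail). Which tail is relevant
    depends on the alternative: kappa > kappa0 makes N - R small, so small
    values of the statistic (the lower tail) count against kappa = kappa0.
    """
    statistic = 2.0 * kappa0 * (N - R)
    upper = chi2_even_upper_tail(statistic, 2 * (N - 1))
    return statistic, upper, 1.0 - upper

# Hypothetical example: kappa0 = 10, N = 12 observations, resultant R = 11.4.
example = kappa0_statistic_tails(10.0, 12, 11.4)
```

The same helper applies to $2\kappa_0(N-X)$ by calling `chi2_even_upper_tail` with 2N degrees of freedom.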


Stephens [87] has constructed significance tables for the exact
and approximate tests of $\kappa = \kappa_0$ both when the polar vector is known
and unknown.


The derivation of the exact tests in higher dimensions for this 


case can be found in Stephens [81]. 


3.2.4 Tests of Several κ's.


Watson [94] first suggested that for a comparison of two $\kappa$'s
an F-test may be used because $1/\kappa$ corresponds to the variance of the
distribution. Watson and Irving [100] state that if samples of $N_1$ and
$N_2$ observations give dispersion estimates $k_1$ and $k_2$ then

(3.2.4.1)  $\frac{k_1}{k_2} \sim \frac{\text{variance with } 2(N_2 - 1) \text{ degrees of freedom}}{\text{variance with } 2(N_1 - 1) \text{ degrees of freedom}}$

assuming the two populations have the same value of $\kappa$. Thus this
assumption may be tested since the right hand side of (3.2.4.1) has
an F-distribution and values of $k_1/k_2$ far from unity suggest that
$\kappa_1 \neq \kappa_2$.


For the general case, Watson and Irving [100] have suggested
that for several populations the ratio of the largest to the smallest
estimates may be used to test the hypothesis that $\kappa$ is constant over
the populations. Watson and Williams [101] suggest that since the test
of homogeneity of the values of $\kappa$ for several results requires the
sample resultants, then (3.1.16) would be the basis for any exact
significance test for this case. Also, they state that for the case
of known polar vectors, there is no practical interest and they do not
pursue this case further. For the case of unknown polar vectors they
do suggest that (3.1.16), being independent of $\kappa$, could be used to
construct an exact test not depending on the nuisance parameter $\kappa$.
However, when both $\kappa$ and $N_i$ are small they state that nothing is
known of suitable tests.


3.3 Tests of Hypotheses Concerning Polar Vectors. 


3.3.1 A Single Prescribed Polar Vector.


Fisher [27] has shown (see Watson and Williams [101]
for a numerical demonstration) that a test of a prescribed
polar vector may be made by using

(3.3.1.1)  $1 - c = \frac{N-R}{R} \left\{ \left(\frac{1}{P}\right)^{1/(N-1)} - 1 \right\}$ ,

where P is the probability that the cosine of the angle between the
resultant and the polar vector is less than c. Thus for example, for
a given sample, R can be calculated and for a radius of confidence at
the 5% level of confidence we use (3.3.1.1) where N and R are now
known and 1/P = 20 so that $\theta = \cos^{-1} c$ is determinable. This value gives a
cone of confidence for the polar vector of the population from the
sample. This test is analogous to the 'Student' [90] t-test for a single
sample (see [27] and [94]).
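The radius of the cone of confidence can be computed directly. The following is a minimal sketch under a stated assumption: it takes Fisher's relation in the form $1 - c = \frac{N-R}{R}\{(1/P)^{1/(N-1)} - 1\}$ with $c = \cos\theta$, and returns the semi-angle $\theta$. The function name `cone_semi_angle` and the example values of N and R are hypothetical.

```python
import math

def cone_semi_angle(N, R, P=0.05):
    """Semi-angle (radians) of the cone of confidence about the sample resultant,
    assuming 1 - cos(theta) = (N - R)/R * ((1/P)**(1/(N - 1)) - 1)."""
    if N < 2 or not 0.0 < R < N or not 0.0 < P < 1.0:
        raise ValueError("need N >= 2, 0 < R < N and 0 < P < 1")
    one_minus_c = (N - R) / R * ((1.0 / P) ** (1.0 / (N - 1)) - 1.0)
    c = 1.0 - one_minus_c
    if c < -1.0:
        raise ValueError("sample too dispersed: the cone would cover the whole sphere")
    return math.acos(c)

# Hypothetical example: N = 10 observations with resultant length R = 9.5.
theta_95 = cone_semi_angle(10, 9.5, P=0.05)
theta_95_degrees = math.degrees(theta_95)
```

For this example the semi-angle is a little under 12 degrees. Tighter clustering (R closer to N) or a larger sample shrinks the cone.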


When $\kappa$ is large and N - R < 2, Fisher [27] has shown by
example that this method can be used as a fiducial test of the hypothesis 
that the polar vector of the population from which the sample is drawn 


has a prescribed direction. 


Watson and Irving [100] give the approximation of (3.3.1.1) 


as 


where $c = \cos\theta$ and k is the best estimate of $\kappa$ and is given by


Watson [94] suggests that, for the case of a given polar vector
prescribed by hypothesis, the test statistic

$$(N-1) \, \frac{R(1-c)}{N-R} \sim F_{2,\,2(N-1)}$$

should be used.


Watson and Williams [101] have developed tests which do not
depend on $\kappa$: If $\kappa$ is known, a test of a prescribed polar vector
may be made by (3.1.6) and when $\kappa$ is unknown the test is the analogue
of the single sample 'Student' t-test and the density of R given
X = Rc is considered.


Stephens [83] reviewed these previous results and provides,
by means of nomograms, an exact test for the null hypothesis that a
given vector is the polar vector of the distribution when $\kappa$ is unknown.
He discusses two approximate tests for this hypothesis as well as an
approximate test that the polar vector lies in a given plane.


Stephens [88] has also constructed nomograms for the case when 
k is unknown for exact and approximate tests of the hypothesis that a 


given vector is the polar vector. 


3.3.2 Comparison of Two Polar Vectors.


Watson [94] considered samples of size $N_1$ and $N_2$, drawn
from two populations and assumed that both populations have equal values
of $\kappa$ and then suggested that the statistic




$$(N-2) \, \frac{R_1 + R_2 - R}{N - R_1 - R_2} \sim F_{2,\,2(N-2)}$$

provided a significance test of the hypothesis that the polar directions
of the populations were identical (because if the mean vectors are very
different, $R_1 + R_2$ will be much greater than R so that large F
indicates significance).
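A small numerical sketch of this two-sample comparison follows. It is an illustration, not code from Watson [94]: it assumes the statistic $(N-2)(R_1 + R_2 - R)/(N - R_1 - R_2)$ is referred to an F-distribution with 2 and 2(N-2) degrees of freedom, and it uses the fact that an F variable with 2 numerator degrees of freedom and m denominator degrees of freedom has upper-tail probability $(1 + 2F/m)^{-m/2}$. The function names are hypothetical.

```python
import math

def resultant_length(vectors):
    """Length of the vector sum of a list of (x, y, z) unit vectors."""
    sx = sum(v[0] for v in vectors)
    sy = sum(v[1] for v in vectors)
    sz = sum(v[2] for v in vectors)
    return math.sqrt(sx * sx + sy * sy + sz * sz)

def two_sample_polar_test(sample1, sample2):
    """Return (F, p_value) for the hypothesis of a common polar vector,
    assuming both populations share the same kappa."""
    N = len(sample1) + len(sample2)
    R1 = resultant_length(sample1)
    R2 = resultant_length(sample2)
    R = resultant_length(list(sample1) + list(sample2))
    spread = N - R1 - R2
    if N <= 2 or spread <= 0.0:
        raise ValueError("need N > 2 and some dispersion within the samples")
    F = (N - 2) * (R1 + R2 - R) / spread
    # Upper tail of F(2, m) with m = 2(N - 2) is (1 + 2F/m)^(-m/2).
    # max(..., 0) guards against a tiny negative F caused by rounding.
    p_value = (1.0 + max(F, 0.0) / (N - 2)) ** (-(N - 2))
    return F, p_value

def cap(axis, t):
    """Four unit vectors at angle t from the z-axis ("z") or the x-axis ("x")."""
    points = []
    for phi in (0.0, math.pi / 2, math.pi, 3 * math.pi / 2):
        a = math.sin(t) * math.cos(phi)
        b = math.sin(t) * math.sin(phi)
        c = math.cos(t)
        points.append((a, b, c) if axis == "z" else (c, a, b))
    return points

same_F, same_p = two_sample_polar_test(cap("z", 0.1), cap("z", 0.1))
diff_F, diff_p = two_sample_polar_test(cap("z", 0.1), cap("x", 0.1))
```

Two samples about the same axis give F near zero, while samples about perpendicular axes give a very large F and a negligible tail probability.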


3.3.3 Test for Several Polar Vectors.


To test the equality of several polar vectors for different
samples, assuming that all populations have the same $\kappa$, Watson [94]
suggested that this test would be similar to that of Section 3.3.2
above. The generalization to p populations has been derived by Watson
and Irving [100]. They suggest the following test statistic:

$$\frac{(N-p)\left(\sum_{i=1}^{p} R_i - R\right)}{(p-1)\left(N - \sum_{i=1}^{p} R_i\right)} \sim F_{2(p-1),\,2(N-p)} , \qquad N = \sum_{i=1}^{p} N_i ,$$

where the sample from the ith population is of size $N_i$ and has
resultant length $R_i$ and R is the length of the vector sum of the
resultants of the separate samples. Here, large values of F would
suggest that the assumption of identical polar vectors is false because
the algebraic sum of the sample resultants

$$\sum_{i=1}^{p} R_i$$

will then be much greater than the length of their vector sum R.


This same generalization can be found in Watson and Williams 
[101]. Moreover, they derive a test which does not include the nuisance 
parameter « . They use the fact that since (3.1.16) is free of k 


it provides a possible exact test. 


3.3.4 Tests for Coplanarity. 


The problem of testing whether three polar vectors are coplanar, 
that is, that the ends of the three vectors all lie on the same great 
circle, was first suggested by Watson [94]. He remarks that the method 


used in the generalization of Section 3.3.3 above would apply but

"... the estimation equations are no longer directly
soluble".

In a subsequent paper, Watson [96] derived a test of coplanarity
for p populations by another method, the likelihood-ratio method.
Because a generalization of this test will be given in Section 4.2 using
the same method but slightly different notation, a brief outline of
Watson's test of coplanarity is given in the next chapter.
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CHAPTER IV 


THE TEST OF COPLANARITY 


The problem of testing whether a set of vectors all lie in the 
same plane normal to a given direction is a problem of deriving a test 
statistic for this case of coplanarity. Watson [96] solved this problem 
by considering several populations each distributed according to the 
Fisher distribution on the unit sphere. In this chapter Watson's test 
of coplanarity is rederived. This is done in Section 4.1. The general- 


ization of this test is derived in detail in Section 4.2. 


We remark here that if we have several populations (each of
which is distributed according to the Fisher distribution) distributed
randomly on the surface of a (unit) sphere and if $\alpha$ is the angle between
a known direction and the vectors representing the population means,
then Watson's case is that of $\alpha = \pi/2$. The generalization extends
the test statistic to values of $\alpha$ from 0 to $\pi$ (i.e., $0 < \alpha < \pi$).
Geometrically, the problem is now a problem of testing whether a set of
vectors emanating from the centre of a sphere form a right circular
cone for which the vertex is the origin of the sphere and the
semi-vertical angle is $\alpha$. The contour on the surface of the sphere will
be a circle. For the particular case $\alpha = \pi/2$ the cone becomes a plane
through the centre of the sphere and the contour is a great circle.


As remarked earlier many geologists have used this distribution 


on the sphere as the natural mathematical model for the surface of the 


earth.


4.1 Watson's Test of Coplanarity. 


Watson [96] considered p populations randomly distributed
on the surface of a unit sphere where each population was distributed
according to the Fisher distribution. The probability density function
for each population can be written in the form

where $(l_i, m_i, n_i)$ are the direction cosines of an observed direction
from the ith population and $(\lambda_i, \mu_i, \nu_i)$ are the direction cosines
of the mean vector of the ith population. If a sample of size $N_i$
is taken from the ith population (i = 1,2,...,p) and if we let
$N = \sum_{i=1}^{p} N_i$ and $(l_i, m_i, n_i)$ be the direction cosines of the vector
resultant of the sample observational vectors from the ith population
and $R_i$ the length of this resultant vector, then

where the $(l_{ij}, m_{ij}, n_{ij})$, $j = 1,2,\ldots,N_i$, are the $N_i$ vectors in the
sample from the ith population so that

$$\cos\theta_{ij} = l_{ij}\lambda_i + m_{ij}\mu_i + n_{ij}\nu_i .$$

Now for each i (i = 1,2,...,p) the logarithm of the likelihood, except
for a constant term, is
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Thus for the entire set of observations the logarithm of the likelihood, 


except for a constant, is

(4.1.1)  $N(\ln\kappa - \ln\sinh\kappa) + \kappa \sum_{i=1}^{p} R_i (l_i\lambda_i + m_i\mu_i + n_i\nu_i)$ .


Watson's test of coplanarity is essentially as follows: if a
known direction $(\lambda, \mu, \nu)$ is orthogonal to all the population means, then
the maximum likelihood (M.L.) estimator of $(\lambda_i, \mu_i, \nu_i)$ is

(4.1.2)  $(\hat\lambda_i, \hat\mu_i, \hat\nu_i) = \left( \frac{l_i - \lambda\cos\theta_i}{\sin\theta_i} , \; \frac{m_i - \mu\cos\theta_i}{\sin\theta_i} , \; \frac{n_i - \nu\cos\theta_i}{\sin\theta_i} \right)$

where

$$\cos\theta_i = l_i\lambda + m_i\mu + n_i\nu .$$


Under these conditions the estimator of k is given by the 


solution of 


p 
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If no restrictions are placed on the mean vectors, then 
$(l_i, m_i, n_i)$ is the M.L. estimator of $(\lambda_i, \mu_i, \nu_i)$ and the estimator of
$\kappa$ is the solution of

(4.1.4)  $\coth\kappa - \frac{1}{\kappa} = \frac{1}{N} \sum_{i=1}^{p} R_i$ .


(The results (4.1.2), (4.1.3) and (4.1.4) will be verified in Section 4.2.) 


Watson's method to test the null hypothesis that all the mean
vectors lie in a plane normal to the prescribed direction $(\lambda, \mu, \nu)$ requires
Wilks' Theorem. It can be stated roughly as:

When N is large, under the null hypothesis the
distribution of -2L is approximately a $\chi^2$
distribution with degrees of freedom depending
on the number of parameters, where L is the
logarithm of the likelihood-ratio.

(See Keeping [44], page 136.) Thus Watson's method is the likelihood-ratio
method.
In this case 
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Similarly it can be shown that 
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zs> 


Te 


(4.1.6) 


and from (4.1.4) we obtain 


= P 
‘ j », i 
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Substituting (4.1.6) and (4.1.7) into (4.1.5) and simplifying we obtain 
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kK is large, the dispersion is small so that 
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ey is a very small number. Since 
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Thus (4.1.8) reduces to 
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Hence for large N (4.1.9) provides an approximate test of the hypothesis 


that a set of p mean vectors lie in a plane normal to $\lambda$.


The same likelihood-ratio procedure gives, as a test of 


coplanarity only, the statistic 


A'U A 
(4.1.10) N A ~ if 0 
Dy 4 Can Bi) 
fal a L 


where the value of $\lambda$ minimizing $\lambda' U \lambda$ is the latent vector of U
corresponding to its least latent root and this vector is the M.L. estimator 


of the normal to the best fitting plane. 


4.2. The Generalization of Watson's Test of Coplanarity. 




Using the same notation as in Section 4.1, we now consider the
general approach to this test: suppose a known direction $(\lambda, \mu, \nu)$ makes
an angle $\alpha$ ($0 < \alpha < \pi$) with all the population means $(\lambda_i, \mu_i, \nu_i)$.
The M.L. estimator of $(\lambda_i, \mu_i, \nu_i)$ is found by using Lagrangian
multipliers. The logarithm of the likelihood of the entire set of
observations (Equation (4.1.1)) becomes (except for a constant)
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Pp 
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where the constraints are 


(4.2.2)  $\lambda\lambda_i + \mu\mu_i + \nu\nu_i = \cos\alpha$ ,
         $\lambda_i^2 + \mu_i^2 + \nu_i^2 = 1$ ,

and $\{\eta_i\}$ and $\{\xi_i\}$ are Lagrange multipliers.


By taking the partial derivatives of (4.2.1) with respect to 


Ay HM, and v, (for some i-=1,...,p) and equating the result to 


zero we obtain 


(4.2.3) K Ryo, + 2n, A, + 6,A = 0 
(4.2.4) KR, m, +2n, wy + EM = 0 
(4.2.5) KR, 0, + 2ny ¥, +6, = 0 


Multiplying (4.2.3) by A, oP (42. ad by MW, and (4.2.5) by v, and 


adding the three equations we obtain 
(2.6) KR, CAL +My, + n.v,) + 2n, +€, cosa = 0 


Multiplying (4.2.3) by A, (4.2.4) by wp, and (4.2.5) by v, and 
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adding we obtain 


(ies 7) KR, (2,4 + ™,u + nv) + 2, cos a + an = 0 


From (4.2.6) and (4.2.7) by elimination 
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Mittiplying (4.2.3) by 2.., (4.204) by) mp, , and (4,2.5) by nu, and 
al i i 


adding we obtain 
(4.2.10) KR, + an; (2,9; + MW, + n,v;,)+ 6, (4,4 + MH + n,v) = 0 
Using (4.2.8) and (4.2.9) this reduces to 


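$\sin^2\alpha + 2\cos\alpha\,(l_i\lambda + m_i\mu + n_i\nu)(l_i\lambda_i + m_i\mu_i + n_i\nu_i) - (l_i\lambda_i + m_i\mu_i + n_i\nu_i)^2 - (l_i\lambda + m_i\mu + n_i\nu)^2 = 0$

(provided $\kappa \neq 0$ and $R_i \neq 0$), which is of the form

$[l_i\lambda_i + m_i\mu_i + n_i\nu_i]^2 + B(l_i\lambda_i + m_i\mu_i + n_i\nu_i) + C = 0$

where $B = -2\cos\alpha\,(l_i\lambda + m_i\mu + n_i\nu)$ and $C = (l_i\lambda + m_i\mu + n_i\nu)^2 - \sin^2\alpha$.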
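As a numerical check on the quadratic above, the following Python sketch evaluates its two roots for one illustrative pair of angles and compares them with $\cos(\alpha - \theta_i)$ and $\cos(\alpha + \theta_i)$, where $\theta_i$ denotes the angle with $\cos\theta_i = l_i\lambda + m_i\mu + n_i\nu$. The function name `quadratic_roots` and the sample angles are assumptions made for illustration; they do not appear in the original derivation.

```python
import math

def quadratic_roots(alpha, cos_theta):
    """Roots x of x**2 + B*x + C = 0 with
    B = -2 cos(alpha) cos(theta) and C = cos(theta)**2 - sin(alpha)**2."""
    B = -2.0 * math.cos(alpha) * cos_theta
    C = cos_theta ** 2 - math.sin(alpha) ** 2
    # The discriminant equals 4 sin^2(alpha) sin^2(theta), so it is never negative;
    # max() only guards against rounding just below zero.
    disc = max(B * B - 4.0 * C, 0.0)
    root = math.sqrt(disc)
    return (-B + root) / 2.0, (-B - root) / 2.0

# Illustrative angles in radians (assumed values, 0 < alpha, theta < pi).
alpha, theta = 1.1, 0.7
r_plus, r_minus = quadratic_roots(alpha, math.cos(theta))

print(r_plus, math.cos(alpha - theta))
print(r_minus, math.cos(alpha + theta))
```

Because $\sin\alpha \sin\theta_i \geq 0$ on the stated ranges, the root with the plus sign is the larger one and corresponds to $\cos(\alpha - \theta_i)$.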




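Since this is a quadratic equation, we have

(4.2.11)  $l_i\lambda_i + m_i\mu_i + n_i\nu_i = \tfrac{1}{2}\left[-B \pm \sqrt{B^2 - 4C}\right]$.

Writing $\cos\theta_i = l_i\lambda + m_i\mu + n_i\nu$ and $\cos\beta_i = l_i\lambda_i + m_i\mu_i + n_i\nu_i$, so that $\theta_i$ is the angle between the $i$th resultant and the known direction and $\beta_i$ is the angle between the $i$th resultant and its population mean, the two roots are $\cos\beta_i = \cos(\alpha \mp \theta_i)$. The larger root, $\cos\beta_i = \cos(\alpha - \theta_i)$, maximizes the likelihood.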
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Now (4.2.8) becomes 
$\xi_i = \kappa R_i[\cos\alpha\cos\beta_i - \cos\theta_i]/\sin^2\alpha$
$\quad = \kappa R_i[\cos\alpha\cos(\alpha - \theta_i) - \cos\theta_i]/\sin^2\alpha$,

i.e.,

(4.2.15)  $\xi_i = \kappa R_i\sin(\theta_i - \alpha)/\sin\alpha$,

and (4.2.9) becomes

$2\eta_i = \kappa R_i[\cos\alpha\cos\theta_i - \cos\beta_i]/\sin^2\alpha$
$\quad = \dfrac{\kappa R_i}{\sin^2\alpha}[\cos\alpha\cos\theta_i - \cos\alpha\cos\theta_i - \sin\alpha\sin\theta_i]$,

(4.2.16)  $2\eta_i = -\dfrac{\kappa R_i\sin\theta_i}{\sin\alpha}$,


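and substituting these results into (4.2.3) we obtain

(4.2.17)  $\hat\lambda_i = [l_i\sin\alpha + \lambda\sin(\theta_i - \alpha)]/\sin\theta_i$,

and similarly, from (4.2.4) and (4.2.5),

(4.2.18)  $\hat\mu_i = [m_i\sin\alpha + \mu\sin(\theta_i - \alpha)]/\sin\theta_i$,
(4.2.19)  $\hat\nu_i = [n_i\sin\alpha + \nu\sin(\theta_i - \alpha)]/\sin\theta_i$.

The results (4.2.17), (4.2.18) and (4.2.19) define the M.L. estimator of $(\lambda_i,\mu_i,\nu_i)$ under the null hypothesis that the known direction $(\lambda,\mu,\nu)$ makes an angle $\alpha$ with all the population means.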
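The constrained estimator can be checked numerically. The sketch below assumes the form obtained by solving (4.2.3) to (4.2.5) with the multipliers (4.2.15) and (4.2.16), namely $\hat\lambda_i = [l_i\sin\alpha + \lambda\sin(\theta_i - \alpha)]/\sin\theta_i$ and likewise for the other two components. The helper names and the sample vectors are hypothetical. It verifies that the estimate is a unit vector, makes the angle $\alpha$ with the known direction, and makes the angle $\theta_i - \alpha$ with the resultant.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def unit(v):
    n = math.sqrt(dot(v, v))
    return tuple(a / n for a in v)

def constrained_mean(x, d, alpha):
    """Estimate of a mean direction constrained to lie at angle alpha from d.

    x: unit resultant direction (l_i, m_i, n_i)
    d: known unit direction (lambda, mu, nu)
    """
    theta = math.acos(max(-1.0, min(1.0, dot(x, d))))
    s = math.sin(theta)
    if s < 1e-12:
        raise ValueError("resultant is (anti)parallel to d; the estimate is not unique")
    a = math.sin(alpha) / s
    b = math.sin(theta - alpha) / s
    return tuple(a * xi + b * di for xi, di in zip(x, d))

# Hypothetical inputs, chosen only to exercise the formula.
x = unit((1.0, 2.0, 2.0))
d = (0.0, 0.0, 1.0)
alpha = math.radians(50.0)

m_hat = constrained_mean(x, d, alpha)
theta = math.acos(dot(x, d))

print("norm squared:", dot(m_hat, m_hat))
print("cos(angle to d):", dot(m_hat, d), "expected", math.cos(alpha))
print("cos(angle to x):", dot(m_hat, x), "expected", math.cos(theta - alpha))
```

Geometrically, the estimate lies in the plane spanned by the resultant and the known direction, which is why it is a linear combination of those two vectors.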


Under this null hypothesis the M.L. estimator of $\kappa$ can be found by differentiating (4.2.1) with respect to $\kappa$ and equating the result to zero. The equation that the M.L. estimator of $\kappa$ must satisfy is

(4.2.20)  $\coth\hat\kappa - \dfrac{1}{\hat\kappa} = \dfrac{1}{N}\sum_{i=1}^{p} R_i\cos\beta_i = \dfrac{1}{N}\sum_{i=1}^{p} R_i\cos(\alpha - \theta_i)$
When no restrictions are placed on the mean vectors, the resultant vector $(l_i,m_i,n_i)$ is the M.L. estimator of $(\lambda_i,\mu_i,\nu_i)$. Thus under the alternative hypothesis, the equation that the M.L. estimator of $\kappa$ must satisfy is

(4.2.21)  $\coth\kappa^* - \dfrac{1}{\kappa^*} = \dfrac{1}{N}\sum_{i=1}^{p} R_i(l_i l_i + m_i m_i + n_i n_i) = \dfrac{1}{N}\sum_{i=1}^{p} R_i$
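Both estimating equations have the form $\coth\kappa - 1/\kappa = \bar r$ with $0 < \bar r < 1$, and the left-hand side increases monotonically in $\kappa$, so the root can be found by bisection. The following Python sketch illustrates this under stated assumptions: the sample sizes, resultant lengths and angles are hypothetical, and the function names `langevin` and `solve_kappa` are introduced here only for illustration.

```python
import math

def langevin(kappa):
    """Return coth(kappa) - 1/kappa for kappa > 0."""
    if kappa < 1e-4:
        # Leading term of the series kappa/3 - kappa**3/45 + ...; avoids cancellation.
        return kappa / 3.0
    return 1.0 / math.tanh(kappa) - 1.0 / kappa

def solve_kappa(r_bar, lo=1e-8, hi=1e6):
    """Solve coth(kappa) - 1/kappa = r_bar by bisection.

    Assumes 0 < r_bar < 1 and that the root lies between lo and hi.
    """
    if not 0.0 < r_bar < 1.0:
        raise ValueError("r_bar must lie strictly between 0 and 1")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if langevin(mid) < r_bar:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical data for p = 3 populations.
N_i = [10, 20, 15]                      # sample sizes
R = [9.6, 19.1, 14.3]                   # resultant lengths R_i
theta = [math.radians(a) for a in (58.0, 61.5, 60.5)]  # angles to the known direction
alpha = math.radians(60.0)              # hypothesized angle
N = sum(N_i)

r_null = sum(r * math.cos(alpha - t) for r, t in zip(R, theta)) / N  # right side of (4.2.20)
r_alt = sum(R) / N                                                   # right side of (4.2.21)

kappa_hat = solve_kappa(r_null)
kappa_star = solve_kappa(r_alt)
print("kappa under the null hypothesis:", kappa_hat)
print("kappa under the alternative:", kappa_star)
```

Since $\cos(\alpha - \theta_i) \leq 1$, the right-hand side of (4.2.20) never exceeds that of (4.2.21), so the estimate under the null hypothesis cannot exceed the estimate under the alternative.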
We now proceed in the same way as in Section 4.1. The logarithm of the likelihood-ratio can be written as

$L = \ln g_\alpha(\theta_i,\hat\kappa) - \ln g_\alpha(\theta_i,\kappa^*)$

where $\ln g_\alpha(\theta_i,\kappa)$ is the expression (4.1.1) considered in relation to general $\alpha$ (hence the subscript $\alpha$). Under the null hypothesis $\ln g_\alpha(\theta_i,\hat\kappa)$ becomes

(4.2.22)  $\ln g_\alpha(\theta_i,\hat\kappa) = N(\ln\hat\kappa - \ln\sinh\hat\kappa) + \hat\kappa\sum_{i=1}^{p} R_i(l_i\hat\lambda_i + m_i\hat\mu_i + n_i\hat\nu_i)$
where by (4.2.17), (4.2.18) and (4.2.19),

$l_i\hat\lambda_i + m_i\hat\mu_i + n_i\hat\nu_i = \dfrac{\sin\alpha + (l_i\lambda + m_i\mu + n_i\nu)\sin(\theta_i - \alpha)}{\sin\theta_i}$
$\quad = \dfrac{\sin\alpha + \cos\theta_i(\sin\theta_i\cos\alpha - \cos\theta_i\sin\alpha)}{\sin\theta_i}$
$\quad = \sin\alpha\sin\theta_i + \cos\alpha\cos\theta_i$
$\quad = \cos(\theta_i - \alpha)$,

so that (4.2.22) becomes

(4.2.23)  $\ln g_\alpha(\theta_i,\hat\kappa) = N(\ln\hat\kappa - \ln\sinh\hat\kappa) + \hat\kappa\sum_{i=1}^{p} R_i\cos(\theta_i - \alpha)$


When $\kappa$ and $N$ are large the dispersion is small so that
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Thus we have 
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When k is large we also have 
Then by (4.2.20) the approximate estimate of $\kappa$ under the null hypothesis is

(4.2.28)  $\hat\kappa = N \Big/ \left[N - \sum_{i=1}^{p} R_i\cos(\alpha - \theta_i)\right]$

and by (4.2.21) the approximate estimate of $\kappa$ under the alternative hypothesis is

(4.2.29)  $\kappa^* = N \Big/ \left[N - \sum_{i=1}^{p} R_i\right]$
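For large $\kappa$ one has $\coth\kappa \approx 1$, so an estimating equation of the form $\coth\kappa - 1/\kappa = S/N$ gives approximately $\kappa \approx N/(N - S)$, where $S$ stands for $\sum R_i\cos(\alpha - \theta_i)$ under the null hypothesis and for $\sum R_i$ under the alternative. The Python sketch below, with hypothetical values of $N$ and $S$ and illustrative function names, shows how the error of this approximation shrinks as $S/N$ approaches 1.

```python
import math

def large_kappa_estimate(N, weighted_sum):
    """Approximate root of coth(k) - 1/k = weighted_sum / N, using coth(k) ~ 1."""
    if not 0.0 <= weighted_sum < N:
        raise ValueError("need 0 <= weighted_sum < N")
    return N / (N - weighted_sum)

def residual(kappa, N, weighted_sum):
    """Left side minus right side of the exact estimating equation."""
    return 1.0 / math.tanh(kappa) - 1.0 / kappa - weighted_sum / N

# Hypothetical total sample size and three assumed values of the weighted sum S.
N = 45
for s in (22.5, 36.0, 43.0):
    k = large_kappa_estimate(N, s)
    print(f"S/N = {s / N:.3f}  kappa ~ {k:6.2f}  residual = {residual(k, N, s):.2e}")
```

The residual equals $\coth\kappa - 1$ evaluated at the approximate root, so it is noticeable for a concentration near 2 and negligible for a concentration above about 10.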
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Substituting (4.2.28) and (4.2.29) into the left hand side of (4.2.27) we 
where $\alpha$ and $\theta_i$ are defined in (4.2.2) and (4.2.13) respectively. The statistic (4.2.30) can be used as a significance test of the null hypothesis that the population mean vectors make an angle $\alpha$ $(0 < \alpha < \pi)$ with a prescribed direction $(\lambda,\mu,\nu)$, provided $\kappa$ is large.


We note here that if $\alpha = \pi/2$ the statistic (4.2.30) reduces to

$\sum_{i=1}^{p} R_i(1 - \sin\theta_i)$

which is (4.1.8), demonstrating that Watson's statistic is indeed the particular case of $\alpha = \pi/2$.
The cases $\kappa = 0$ and $R_i = 0$ are degenerate, trivial cases.

Another approximate form can be obtained when $\kappa$ is large. In this case the dispersion is small so that we can let

$\alpha - \theta_i = \varepsilon_i$.
Then

$2(1 - \cos(\alpha - \theta_i)) = 2(1 - \cos\varepsilon_i) = \varepsilon_i^2 + O(\varepsilon_i^4)$

and

$\sin^2(\alpha - \theta_i) = \varepsilon_i^2 + O(\varepsilon_i^4)$,


so that (4.2.30) has the further approximate form 
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is large. 


In conclusion, either (4.2.30) or (4.2.31) provides a test statistic for the hypothesis that the ends of the mean vectors of $p$ populations all lie on a circular section of the unit sphere when the populations are distributed according to the Fisher distribution.
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4.3 Remarks on Further Study 


To obtain a statistic analogous to (4.1.9) we must minimize (4.2.30) with respect to $\lambda$, $\mu$, $\nu$ and $\alpha$ subject to $\lambda^2 + \mu^2 + \nu^2 = 1$. This is equivalent to minimizing the expression

$\sum_{i=1}^{p} R_i[1 - \cos(\alpha - \theta_i)]$  (subject to $\lambda^2 + \mu^2 + \nu^2 = 1$)

with respect to these four parameters. This has been done by restricting $\alpha$ $(0 < \alpha < \pi)$ to $\alpha \neq \pi/2$ and obtaining


and a system of three equations that the minimizing values of $\lambda$, $\mu$ and $\nu$ would have to satisfy. However, the form of this system does not allow for further reduction. This problem at present is an open question.
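Although no closed form is available, the minimization can be explored numerically. The sketch below is one possible approach and is not the thesis's method. For a fixed trial direction, the sum $\sum R_i\cos(\alpha - \theta_i)$ equals $C\cos\alpha + S\sin\alpha$ with $C = \sum R_i\cos\theta_i$ and $S = \sum R_i\sin\theta_i$, so the best $\alpha$ is `atan2(S, C)` and the minimum value is $\sum R_i - \sqrt{S^2 + C^2}$. The direction is then found by a crude grid search. The function names, the grid resolution and the data are all hypothetical.

```python
import math

def profile_alpha(resultants, weights, d):
    """For a fixed unit direction d, return (alpha, value) minimizing
    sum_i R_i * [1 - cos(alpha - theta_i)], where cos(theta_i) = x_i . d."""
    S = C = 0.0
    for x, R in zip(resultants, weights):
        c = max(-1.0, min(1.0, sum(a * b for a, b in zip(x, d))))
        C += R * c
        S += R * math.sqrt(1.0 - c * c)  # sin(theta_i) >= 0 since 0 <= theta_i <= pi
    alpha = math.atan2(S, C)
    value = sum(weights) - math.hypot(S, C)
    return alpha, value

def grid_search(resultants, weights, steps=36):
    """Crude search over directions on a colatitude/longitude grid."""
    best = None
    for i in range(steps + 1):
        colat = math.pi * i / steps
        for j in range(2 * steps):
            lon = math.pi * j / steps
            d = (math.sin(colat) * math.cos(lon),
                 math.sin(colat) * math.sin(lon),
                 math.cos(colat))
            alpha, value = profile_alpha(resultants, weights, d)
            if best is None or value < best[0]:
                best = (value, alpha, d)
    return best

# Hypothetical resultants lying exactly on a cone of half-angle 40 degrees about the z-axis.
alpha0 = math.radians(40.0)
resultants = [(math.sin(alpha0) * math.cos(phi),
               math.sin(alpha0) * math.sin(phi),
               math.cos(alpha0)) for phi in (0.3, 1.9, 3.4, 5.0)]
weights = [9.6, 19.1, 14.3, 11.8]  # resultant lengths R_i

value, alpha, d = grid_search(resultants, weights)
print("minimum value:", value)
print("alpha (degrees):", math.degrees(alpha))
print("direction:", d)
```

A direction and its opposite give the same minimum, with $\alpha$ replaced by $\pi - \alpha$, so the search can return either one. A grid search only locates the minimum to the grid resolution; a local optimizer would be needed to refine it.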


The statistic (4.2.30) is useful for the case of known $\alpha$. The problem of obtaining a test statistic for unknown $\alpha$ could also be investigated.

The statistic (4.2.30) is an approximation because $N$ (as well as each $N_i$) and $\kappa$ were assumed to be large. Without these assumptions it seems possible that an exact test could be derived. For small $\kappa$, the estimation equations for $\kappa$ in the case of $\alpha = \pi/2$ (as well as in the general case) are difficult to reduce to a suitable workable form.
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Only a very complicated form of L is possible and a satisfactory form 


has not yet been derived. 


For the case $\alpha = \pi/2$ an analysis of variance analogue for the statistic (4.1.9) has been suggested by Watson [96]. It has the form


N- A'UA 
(4.3.1) ate 


eee F 
P oi = “p,2(N-p) 
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For large N this is the same as (4.1.9). Thus (4.1.9) and (4.3.1) 
would provide two approximate tests of the hypothesis that a set of p 
mean vectors lie in a plane normal to A . Watson has also shown by 
a numerical example that (4.3.1) could be conveniently arranged in an 
analysis of variance table. The problem of deriving an analysis of 
variance analogue for the generalized test statistic (4.2.30) should 


also be pursued.
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